[57]


ABSTRACT 


An autonomous expert system for directly maintaining remote computer systems by directly accessing the remote computer systems, diagnosing, and clearing fault conditions on those computer systems. The expert system performs those functions by first accessing a fault report from a centralized service reporting center, establishing a data connection to the computer system reporting the fault, invoking diagnostic routines on the computer system to gather data about the reported fault, analyzing the data, and, if appropriate, clearing the reported fault from the computer system. If the fault cannot be cleared, the expert system recommends maintenance procedures and replacement parts for a technician who the expert system dispatches to the remote computer. The recommendations are based on field experience stored in rules and databases maintained by the expert system. When the remote computer system is controlling a customer switching system (PBX), the expert system only invokes testing procedures in the computer system which do not disrupt stable telephone calls. The expert system accesses the PBX via the public telephone network.

15 Claims, 24 Drawing Sheets
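
The maintenance cycle summarized in the abstract above (obtain a fault report, connect, run diagnostics, then either clear the fault or recommend a dispatch) can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the C5 rule-based program of the microfiche appendix. The names `TroubleTicket`, `FakePBX`, `run_session`, `run_diagnostic`, `clear_alarm`, and the `RECOMMENDED_PART` table are all hypothetical. The unit-type numbers 2, 68, and 13 mirror the example alarms discussed later in the description.

```python
from dataclasses import dataclass, field


@dataclass
class TroubleTicket:
    """Hypothetical fault report fetched from the service reporting center."""
    pbx_id: str
    alarm_unit_types: list


@dataclass
class FakePBX:
    """Stand-in for a remote PBX reached over a data connection (assumption)."""
    persistent_faults: set                      # unit types whose fault still exists
    cleared: list = field(default_factory=list)

    def run_diagnostic(self, unit_type):
        # Returns True when the diagnostic still finds a failure.
        return unit_type in self.persistent_faults

    def clear_alarm(self, unit_type):
        # Records that the alarm indication was cleared on the PBX.
        self.cleared.append(unit_type)


# Hypothetical field-experience table: unit type -> recommended replacement part.
RECOMMENDED_PART = {2: "tape cartridge"}


def run_session(ticket, pbx):
    """Diagnose each reported alarm; clear it or recommend a dispatch."""
    report = {"cleared": [], "dispatch": []}
    for unit_type in ticket.alarm_unit_types:
        if pbx.run_diagnostic(unit_type):
            # Fault persists: recommend a technician visit and a part.
            part = RECOMMENDED_PART.get(unit_type, "unknown part")
            report["dispatch"].append((unit_type, part))
        else:
            # Fault was transitory: retire the alarm remotely.
            pbx.clear_alarm(unit_type)
            report["cleared"].append(unit_type)
    return report


ticket = TroubleTicket(pbx_id="PBX 105", alarm_unit_types=[2, 68, 13])
pbx = FakePBX(persistent_faults={2})
report = run_session(ticket, pbx)
print(report)
```

In this sketch the tape unit alarm (unit-type 2) persists and produces a dispatch recommendation, while the other two alarms are cleared remotely. A real system would replace `FakePBX` with a dial-up session and the lookup table with the rule base and databases described below.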

AUTONOMOUS EXPERT SYSTEM FOR DIRECTLY MAINTAINING REMOTE TELEPHONE SWITCHING SYSTEMS

REFERENCE TO A MICROFICHE APPENDIX

This application contains a microfiche appendix, designated A, which lists program instructions incorporated in the disclosed expert system. The total number of microfiche is 2 sheets and the total number of frames is 135.

TECHNICAL FIELD

This invention relates to maintaining computer systems and in particular to maintaining such systems using an expert system.

BACKGROUND OF THE INVENTION

Modern private branch exchanges (PBXs) use a computer to control a switching network. PBXs are also referred to as customer switching systems or private automatic branch exchanges (PABX). In addition to controlling the PBX, the computer is continuously running basic diagnostic tests not only on itself but also on the switching network and communication facilities interconnecting the PBX to other PBXs and other types of computer systems. In addition to permanent faults/alarms, these diagnostic tests find many transitory faults within the PBX. The transitory faults may indicate that a component of the PBX is marginally faulty or that the PBX's environmental conditions induced a failure in the PBX. Such environmental conditions result from a variety of sources ranging from error conditions on the communication facilities to electrical noise in the AC power supplied to the PBX at its site. Each fault occurring on a PBX must be investigated by a service technician to determine the severity of the fault. When a PBX manufacturer has thousands of PBXs to maintain in the field, the cost of making such investigations becomes enormous.

Some manufacturers have equipped their PBXs to report all faults to a centralized service reporting center. A technician at the service reporting center reviews the fault reports and then remotely accesses the PBX to determine the cause of the faults. Whereas the ability of a technician to remotely maintain PBXs is an improvement, the manufacturer still incurs considerable costs in maintaining PBXs in the field because of the labor cost of technicians.

Expert systems have been extensively used to assist in the maintenance of remote systems by directly supporting maintenance technicians. U.S. Pat. No. 4,697,243 discloses an expert system which assists technicians in the maintenance of elevators. In that system, an expert system running on a central computer leads an on-site technician through a diagnostic session with menus, questions, and directions displayed to the technician on a remote terminal. The technician communicates fault and test data to the expert system via the terminal. The expert system then diagnoses the elevator fault and sends the diagnosis back to the technician who then repairs the elevator.

U.S. Pat. No. 4,517,468 discloses an expert system executing on a central computer for collecting data from remote steam turbine generator power plants. After collecting the data from a plant, the expert system determines if a fault condition exists in that plant by using field knowledge incorporated into the expert system. Upon detection of a fault condition, the expert system communicates the information to the plant operator with suggested actions to be taken. However, the expert system does not directly run any tests on the plant or alter the state of data within the plant. Further, the system requires a unique mechanism for accessing the data from the plant since the system cannot use the same means to gather data as used by technicians.

The problem with prior expert systems that diagnose fault conditions in remote computer systems is that they require a human technician to determine the fault condition or to test and retire fault alarms in those systems. Also, those expert systems require special mechanisms for gaining access to remote system data which add to the operating costs.

SUMMARY OF THE INVENTION

A technical advancement is achieved by an expert system that maintains remote computer systems by directly accessing the remote computer systems, diagnosing, and clearing fault conditions on those computer systems. The expert system performs those functions by first accessing a fault report from a centralized service reporting center, establishing a data connection to the computer system reporting the fault, invoking diagnostic routines on the computer system to gather data about the reported fault, analyzing the data, and, if appropriate, clearing the reported fault from the computer system. Advantageously, if the fault cannot be cleared, the expert system recommends maintenance procedures and replacement parts for a technician who the expert system dispatches to the remote computer. The recommendations are based on field experience stored in rules and databases maintained by the expert system. The centralized service reporting center receives [illegible] alarms from the computer systems. The expert system accesses the reporting center via a digital link.

The expert system accesses the remote computer system in the same manner as a technician by placing a call through the public telephone switching network. After gaining access to the remote computer system, the expert system invokes test procedures to obtain further data from the computer system and retire alarms representing transitory faults. When the remote computer system is controlling a customer switching system (PBX), the expert system only invokes testing procedures in the computer system which do not disrupt stable telephone calls. In addition, the expert system is capable of maintaining different vintages of the same PBX and identifies the vintage by interrogating each PBX.

In addition, the expert system maintains databases in which the results of previous accesses to any individual PBX are recorded. That information is continuously reused for each access to an individual PBX to diagnose alarms on that system and identify recurrent problems.

These and other features and advantages of the invention will become more apparent from the following description of an illustrative embodiment of the invention considered together with the drawing.

DESCRIPTION OF THE DRAWING

FIG. 1 is a block diagram of a plurality of systems including a computer that executes an expert system embodying the principles of the invention for maintaining the illustrated PBXs;
FIGS. 2 through 24 illustrate, in flow diagram form, the logical flow of the expert system of Microfiche Appendix A;

FIGS. 25 through 33 provide an example of information displayed by the expert system;

FIG. 34 illustrates a portion of the SINGLE FAILURE database;

FIG. 35 illustrates a portion of the MULTI-FAILURE database; and

FIG. 36 illustrates a portion of the HISTORY database.

DETAILED DESCRIPTION

FIG. 1 illustrates expert system 102 which embodies the principles of the invention for performing maintenance on a plurality of PBXs (105 and 114) via public telephone network 113. Expert system 102 is executing on computer 100 which advantageously may be of the AT&T 6300 family of personal computers. PBXs 114 and 105 (also referred to as customer switching systems) are telephone switching systems whose telephone switching network is controlled by a stored program computer. Within each PBX, the computer is constantly executing diagnostic routines checking for fault conditions. For example, within PBX 105, computer 122 is periodically running diagnostic routines to verify not only the state of switch 123 but also the state of the central office trunks such as DS-1 120, digital transmission facility (DCIU) 124 and tape unit 121. If a fault is detected with one of these units, PBX 105 records that fault. Such a fault is commonly referred to as an alarm or an alarm condition and results in a call being placed by PBX 105 via the public telephone network 113 to the service reporting center 104. Upon completion of the call, computer 122 transmits the alarm information to service reporting center 104 where it is recorded. The process of recording alarms by service reporting center 104 is referred to as generating a trouble report or trouble ticket.

Once service reporting center 104 has recorded the trouble ticket in its internal database, then either a human technician or expert system 102 accesses service reporting center 104 to obtain the trouble ticket. Either the technician or expert system 102 accesses PBX 105 via public telephone network 113 to run diagnostic procedures (PROCs) on computer 122 to perform a complete diagnosis of the state of PBX 105. The accessing of PBX 105 for this purpose is referred to as a session. For the AT&T System 85, detailed information on the operation of the PROCs is set forth in the manual entitled, "AT&T System 85, Release 2, Version 4, Maintenance."

After either the technician or expert system 102 has initiated a session with PBX 105, a listing of the alarms found by computer 122 is obtained and then PROCs are run to perform diagnostic tests and gather additional information concerning the nature of the alarms. Many alarms can be resolved through the execution of PROCs and do not require the replacement of any parts within PBX 105. After finishing the session with PBX 105, both the technician and expert system 102 generate a report indicating what alarms, if any, still exist on the system and a recommendation about the desirability of sending a technician to the site. Expert system 102 also recommends what spare parts may be needed to resolve the remaining alarms.

Consider now in greater detail the operations of expert system 102 in maintaining PBX 105. As illustrated in FIG. 1, PBX 105 consists of switch 123, computer 122, tape unit 121, and DCIU 124 which interconnects PBX 105 to message center 125. The remainder of this description uses the following example to illustrate the operation of expert system 102. The example assumes that computer 122 has determined the existence of alarms with respect to computer 122, tape unit 121, and a data transmission facility such as DS-1 120 which are unit-type alarms 13, 2, and 68, respectively. These alarms are illustrated in FIG. 27.

Expert system 102 is constructed using a rule-based methodology. Such a methodology allows expert system 102 to represent units of knowledge in the form of rules, which allows easy change of the knowledge represented in each rule without disturbing the rest of the system. A rule-based system consists of three components: working memory, rule memory, and an inference mechanism. The working memory describes the current state of the rule-based system, and moderates all communication between rules. If a rule needs to pass values to another rule, then it must do so through the working memory. The programmer declares items in working memory using a format that is similar to type-declaration information in standard programming languages.

The rule memory is a collection of rules. Each rule consists of a set of conditions and a set of actions. The programmer constructs the rules so that each represents a functionally independent and meaningful portion of the problem solution. In addition, the rule base has access to databases where additional knowledge and PBX specific history information is stored. Examples of such databases in expert system 102 are the SINGLE FAILURE and MULTI-FAILURE databases.

In procedural languages, the sequence of program statements and explicit control statements determine execution order. In rule-based programming, the inference mechanism regulates the matching, selection and execution of rules. The inference mechanism is similar to an interpreter executing the following four-step loop:

(1) In the match phase, the inference mechanism collects all rules whose conditions match the current state in working memory.

(2) During the select stage, the inference mechanism selects the rule to be executed. If there is more than one rule that matches the current state, a process called conflict resolution specifies how rule priority is determined.

(3) In the act stage, the actions specified by the selected rule are executed. This results in modifications to the state of working memory.

(4) After the rule's actions are executed, the inference mechanism again begins the match stage. This loop continues until no more rules match working memory, or until an explicit halt is encountered.

Expert system 102 is based on the C5 programming language which is similar to the OPS5 programming language designed and built by Carnegie-Mellon University. Further details concerning C5 can be found in the article entitled, "Rule-Based Programming in the Unix® System," G. T. Vesonder, AT&T Technical Journal, Jan.-Feb. 1988, Volume 67, Issue 1. Expert system 102 is illustrated in C5 source language in Microfiche Appendix A.

Expert system 102 not only incorporates engineering and field experience within the rules of the program, but also in databases. In particular, the SINGLE FAILURE and MULTI-FAILURE databases are used to store recommendations on whether to replace parts
which have caused alarms within PBX 105. In addition, expert system 102 maintains ADMINISTRATION, HISTORY, and SWITCH databases. The ADMINISTRATION database contains information detailing how the different components are utilized within a PBX. The SWITCH database records information about configuration reported to expert system 102 by each particular PBX with which it has communicated. Finally, the HISTORY database contains a history of the alarms found and action taken for each PBX session. The HISTORY database is used to anticipate serious problems within a particular PBX and to gather additional field experience for later incorporation into the rules (see Microfiche Appendix A) and into the SINGLE FAILURE and MULTI-FAILURE databases.

Expert system 102 is executed by processor 106 in computer 100 as illustrated in FIG. 1. Programs are stored in memory 107 whereas the databases are stored in disk 108. Initially, controller 101 accesses service reporting center 104 via modem 111 and public telephone network 113. From service reporting center 104, controller 101 obtains the trouble ticket information for PBX 105. Expert system 102 then obtains the trouble ticket from controller 101. Expert system 102 then opens a session with PBX 105 by accessing PBX 105 via VMAAP 103, modem 110, and public telephone network 113. As described in greater detail with respect to FIG. 2, expert system 102 now obtains customer data and data concerning the functions of PBX 105. After obtaining this information, expert system 102 requests and obtains from PBX 105 the number of alarms presently existing on the PBX and detailed information about the unit-type and location of each alarm.

For the present example, this information is illustrated in FIG. 27. A unit-type 2 alarm indicates that there is a tape unit problem. A unit-type 68 alarm indicates that there has been an error on a data transmission facility such as DS-1 120, and a unit-type 13 alarm indicates a failure on a port data section of the memory of computer 122. As will be described in greater detail with respect to FIGS. 13 and 14, expert system 102 performs the appropriate PROCs with respect to tape unit 121 and determines that the unit-type 2 alarm cannot be cleared since the trouble/fault still exists on tape unit 121. Then by utilizing the data within the SINGLE FAILURE database, expert system 102 recommends that a service technician be dispatched to PBX 105 with a new tape cartridge to replace the existing one. Expert system 102 next analyzes the unit-type 13 and 68 alarms by executing the appropriate PROCs and utilizing the SINGLE FAILURE and MULTI-FAILURE databases. Expert system 102 will successfully cause PBX 105 to recover from these two alarms.

In addition, during each session, expert system 102 performs preventive maintenance with respect to PBX 105 by determining whether computer 122 has undergone any initializations. Based on an examination of these initializations using the HISTORY database, expert system 102 will recommend whether a service technician should be dispatched to PBX 105 to perform specified service procedures which can include part replacement.

FIG. 2 illustrates the initial setup and logging on to PBX 105 by expert system 102 via VMAAP 103. Initially, expert system 102 obtains information concerning the trouble ticket from controller 101. Upon obtaining this information, expert system 102 creates initial working memory elements using block 203 and opens the necessary output files via block 204. An example of such a working memory element is the switch memory element which contains the information illustrated in FIG. 25 in lines 2501 and 2502. An example of an output file opened by block 204 is one used to store information such as illustrated in FIGS. 25 through 33. Block 205 initiates database table structures.

Using block 206, the expert system 102 instructs VMAAP 103 to place a call to PBX 105 in order to open the session with PBX 105. Block 206 transmits a command to VMAAP 103 which contains the necessary information to identify PBX 105. Expert system 102 waits in block 206 until it receives information back from VMAAP 103 indicating that the connection has been made. Expert system 102 then requests the status of the connection via block 207. Based on the status determined by block 207, decision block 208 determines if a connection has been made. If the connection status is "Al" indicating a successful login to PBX 105, then path 216 is followed to block 209. However, if the status is not "Al", block 210 is executed via path 215. Execution of block 210 indicates that expert system 102 was unable to log on to PBX 105 via VMAAP 103. This information then is displayed via block 211 (see FIG. 22).

If it was possible to log on to PBX 105, expert system 102 requests, in block 209, the customer data from PBX 105 via VMAAP 103. Upon receiving the customer data, block 209 parses this data so that it is in a usable form.

Next, expert system 102 determines the available modes of operation and switch parameters of PBX 105. This is accomplished in FIG. 3. Blocks 301 through 307 determine what modes are implemented and available. The modes include the administration, maintenance, and tape modes. The modes are required in order to avoid interaction problems with other entities that may also be doing work on the PBX, such as an on-site craftsperson. The administration mode allows administration of the different characteristics of the PBX such as which telephone numbers are assigned to which physical ports. The maintenance mode allows the different maintenance procedures to be executed. The tape mode allows the administration data stored on-line in the PBX's memory to be transferred to tape unit 121.

The first decision that must be made is whether the PBX 105 is of a version that requires the modes. This is done by block 301. Certain earlier versions of PBX 105 did not have modes since only one entity at a time could access the PBX. If the decision is made in decision block 301 that the PBX under test does have modes, block 302 requests the data on which modes are available. Next, block 303 checks if the maintenance mode is available. If the maintenance mode is not available, path 325 is followed to block 305 where the recommendation is set so that the message "maintenance mode not available" is displayed during the display recommendation portion of the session by block 211. If the maintenance mode is not available, expert system 102 must stop the session with PBX 105 at this point since it cannot proceed without itself setting the maintenance mode. If the maintenance mode is available, decision block 304 via path 326 checks whether the administration mode is available. If the administration mode is available, then block 307 via path 328 sets both the administration and maintenance modes. If the administration mode is not available, path 327 is followed from decision block 304 to block 308 which sets the maintenance mode only. If
the administration mode is not available, the session does not stop since expert system 102 can perform limited maintenance functions without the administration mode.

Blocks 310 through 321 determine whether computer 122 in PBX 105 is duplicated, i.e., has two processors. One processor is online and the other one is offline waiting to be brought online if the current online processor fails. If PBX 105 has duplicated processors, it is necessary to test both of them. Two procedures (PROCs) perform the duplication testing. These PROCs are detailed in the aforementioned AT&T Maintenance Manual. First, PROC 275 is executed by block 309. Decision block 310 checks the information returned by PBX 105 in response to PROC 275. If no error code is returned by PBX 105, path 332 is followed to decision block 316. If an error code of "3" is returned, expert system 102 determines that an error may have occurred, and block 311 once again executes PROC 275 in PBX 105. Decision block 313 checks the results of the execution of block 311; and if no error code is returned, path 334 is followed to decision block 316. If decision block 310 finds an error code other than "3" or if decision block 313 finds any error code, paths 330 or 335, respectively, are followed from these decision blocks to block 314. Block 314 executes PROC 613. The results of the execution of PROC 613 are checked by decision block 319. If no error code is returned, path 340 is followed to block 318 since this indicates that PROC 613 has determined that computer 122 is duplicated. If error code 74 is returned, path 342 is followed to block 317 since this indicates that PROC 613 has determined that computer 122 is not duplicated. If any other error code is returned, path 341 is followed to block 320 which indicates that the state of duplication is unknown.

If no error code was returned from the execution of PROC 275 in block 309, then path 332 is followed from decision block 310. Decision block 316 examines field 10 of the information returned by PBX 105 in response to PROC 275. This field indicates whether computer 122 is duplicated. If computer 122 is not duplicated, then block 317 marks it as such. If PROC 275 indicates that the processor is duplicated, block 318 marks the system as having a duplicated processor.

The information displayed in lines 2501 of FIG. 25 was obtained in block 209. Blocks 309 through 321 obtained line 2502.

FIG. 4 illustrates the additional administrative tasks that are performed before the testing of PBX 105 can commence. Blocks 401 through 409 check the SWITCH database to determine whether PBX 105 has had maintenance performed on it in the past by expert system 102. First, block 401 queries the SWITCH database to determine if it contains any data with respect to the PBX under test. Decision block 402 then determines the number of entries. If the PBX has not been encountered before, path 418 is followed to blocks 407 and 409 which build an entry in the SWITCH database for this PBX. If paths 419 or 420 are followed, the switch has previously had maintenance performed on it. However, what must still be determined is whether the PBX's configuration has changed; and if it has changed, what the newest configuration is.

Path 420 is taken if the PBX has only one recorded configuration. If path 419 is followed indicating the existence of more than one past configuration, then it is necessary to determine the newest entry which defines the latest configuration that expert system 102 has encountered with respect to PBX 105. After this has been determined, block 405 is executed which determines (on the basis of the information received and, as shown in FIG. 25 in lines 2501 and 2502) whether the present configuration of PBX 105 represents a change from its last recorded configuration. If there has been a change, block 407 via path 420 creates a new database entry for this PBX. If there has not been a change, path 421 is followed to decision block 410.

Blocks 410 to 413 are concerned with obtaining the local time maintained by PBX 105. The local time is important since PBX 105 may be located anywhere in the world, and the PBX alarms which will be discussed later are time-stamped relative to this local time. If decision block 410 determines that the administration mode has been set by expert system 102, path 423 is followed to blocks 412 and 413 which obtain the local time from PBX 105. Block 414 then displays the local time as indicated in line 2503 of FIG. 25. If the administration mode is not available, path 422 is followed to connector 416.

Having obtained the information defining the system parameters of PBX 105, expert system 102 now obtains an overall view of the maintenance condition of PBX 105 by execution of PROC 600. Execution of PROC 600 on PBX 105 obtains the number of alarms on the PBX and then requests are made for detailed information about each alarm. Block 501 transmits the PROC 600 request to PBX 105 via VMAAP 103. PBX 105 responds to this request with the number of alarms which are outstanding within the PBX. This number is read by block 502, and decision block 503 makes a determination of whether any alarms exist on PBX 105. If no alarms exist, path 522 is followed to block 506 which proceeds to check the offline processor, if any, for alarms.

If alarms exist, then path 523 is followed to block 504 which displays lines 2701 of FIG. 27. Expert system 102 obtains information concerning each of these alarms by the repetitive transmission of the "next circuit" command of PROC 600 by block 507. As information about each alarm is received, it is displayed by block 508 to create each line in lines 2702 of FIG. 27. Warning alarms and the trunk software alarms identified by decision blocks 510 and 512, respectively, are immediately cleared by block 511 via paths 524 and 527, respectively. The sequence of blocks 504 through 512 is terminated by decision block 514 after information on all the alarms has been obtained. As long as there is still an outstanding alarm, path 530 is followed back to block 507. After information about all alarms has been obtained from PBX 105, path 531 is followed to decision block 515. Blocks 515 through 518 account for the fact that DCIU 124 and tape unit 121 within PBX 105 can each have multiple alarms. Since the diagnostic tests performed for these units are complete, it is desirable only to perform each test once per session. Hence, blocks 515 through 518 remove all but at most one alarm for DCIU 124 and tape unit 121.

There are three overall goals that must be achieved in the execution of each of the diagnostic PROCs. First, expert system 102 determines the failures that caused each alarm by executing diagnostic routines on PBX 105. Secondly, expert system 102 generates a report detailing the results of the diagnostic routines and indicating the remaining alarms on the system. Finally, expert system 102 provides recommendations for replacing hardware on PBX 105 if necessary. An example of such a recommendation is illustrated in FIG. 33 where it is recommended to replace the tape cartridge of PBX 105.

FIG. 6 is concerned with the order in which the alarms obtained in FIG. 5 should be processed. The aforementioned AT&T Maintenance Manual indicates that the alarms should be processed in the order in which they are received upon execution of PROC 600. This order is given by the entry index number as illustrated in FIG. 27. However, field knowledge obtained and incorporated into expert system 102 indicates that if both unit-type 11 and 23 alarms are encountered, then the alarm with a unit type of 23 should be processed first. Decision block 601 and block 602 accomplish this. Similarly, experience has shown that if unit-type 55, 56 and 57 alarms are present together, then the alarms should be processed in descending order by unit type (i.e., 57 then 56 then 55). This is accomplished by blocks 603 and 605. Otherwise, if none of these special cases are encountered, block 606 simply chooses the unprocessed alarm with the highest index number.

Decision block 607 results in the execution of the diagnostic PROCs detailed in FIGS. 7, 9, 11, 12, 13, and 15. It is important to remember that expert system 102, whose overall logical flow is illustrated in FIGS. 2 through 24, is implemented in a rule-based language (see Microfiche Appendix A). For more complete details on how each diagnostic routine is implemented, the code starting at page 36 and entitled, "Referred PROCs" in Microfiche Appendix A should be examined.

The following is a description of the operation of each of these PROCs. Note that in our present example alarm information received from PBX 105 as illustrated in FIG. 27 indicates the existence of alarms of unit-types 2, 68, and 13. Unit-type 2 alarms indicate a problem with the tape unit. Unit-type 68 alarms indicate that there has been an error on a data transmission facility such as DS-1 120. Unit-type 13 alarms indicate a failure in the port data section of the memory of computer 122. However, for unit-type 13 alarms, field experience incorporated into expert system 102 has shown that a variety of different equipment failures within PBX 105 may result in that type of alarm. Those failures will be

replace the tape cartridge. In response to the execution of PROC 610, decision block 1308 determines whether tape unit failures have occurred. If no tape unit failures are outstanding, blocks 1309 and 1310 are executed via path 1320. These blocks note that the alarm has been cleared, and a command is sent to PBX 105 to clear the indication in the PBX. In FIG. 14, blocks 1401 through 1403 collect all the failures associated with the unit-type alarm 2 on PBX 105.

FIG. 7 illustrates the logical flow performed in utilizing PROC 620 to determine the cause of a network alarm. First, logical decision block 701 ascertains whether there are any such alarms. If there is an outstanding network alarm, then control is passed to block 703 via path 729. Block 703 first obtains the unit-type and location of the failing network unit by execution of PROC 620. Decision blocks 705 through 709 check whether there are special cases which make it undesirable to execute the diagnostic portion of PROC 620. Decision block 705 determines whether there are intermodule calls that could be dropped if the diagnostic portion of PROC 620 is executed. If intermodule calls could be dropped, control is transferred to block 710 via path 716. Decision block 706 checks whether the failing network unit is the attendant console interface (unit-type 44). If the attendant console interface is failing, this test cannot be performed since to properly perform the test, the attendant console headset must be unplugged, which requires a craftsperson on site. If the unit failing is unit-type 44, path 718 is followed to block 710; if not, path 719 is followed to decision block 707. The latter decision block checks whether the alarm is of unit-type 45 which indicates an auxiliary trunk circuit pack. In order to test that circuit pack, the DIP switches must be set to a particular setting which is impossible to verify remotely. If the alarm is of unit-type 45, path 720 is followed to block 710; otherwise path 721 is followed to decision block 708.

Decision block 708 determines whether the failing circuit pack has been marked as "maintenance busy" indicating that there is a craftsperson on site performing maintenance tests on this particular circuit pack. If the circuit pack is marked as maintenance busy, path 722 is followed to block 710; otherwise, path 723 is followed

further detailed during the discussion of unit-type alarm 45 to decision block 709. Decision block 709 checks a"S 

13. number of special situations where stable calls could be 

FIG. 13 shows the logical operations performed by dropped/disconnected if the diagnostic portion of 

expert system 102 in response to a unit-type 2 alarm. PROC 620 is executed. If stable calls could be dropped, 

This alarm indicates that an error has been encountered path 724 is followed to block 710 since it is undesirable 

on tape unit 121. Decision block 1301 represents the 50 to perform a test that could potentially drop calls. If the 

logical select stage of rule-based expert system 103 for a testing would cause no stable calls tojz&diapped, path 

unit-type alarm 2. 725 is followed to connector 712. Block 710 marks the 

Logical blocks 1302 through 1306 are concerned alarm as "uncleared," which terminates the session, 

with versions of PBX 105 which require modes. If PBX Connector 711 interconnects to a portion of the pro- 

105 is of a non-mode version, then path 1315 is followed 55 gram which obtains the board type. This portion is 

to block 1307. If modes are required, then blocks 1303 executed so that the board can be checked to insure that 

and 1304 determine whether the tape mode is available it is of the proper vintage and is properly administered, 
to expert system 102. If the tape mode is not available, FIG. 8 illustrates another special case where the diag- 

path 1317 is followed from decision block 1304 to block nostic portion of PROC 620 often cannot be run, Unit- 

1305 which marks the alarm as "uncleared/'Marking 60 type 68 alarms indicate a failing DS-1 trunk unit such as 

the alarm as uncleared causes a message indicating that unit 120 which has 23 separate channels, each capable of 

fact to be printed. If the tape mode is available, then . carrying a telephone conversation. Since the probabil- 

block 1306 is executed via path 1318 to set the tape ity is extremely high that at least one of these channels 

mode. will be active at any given point in time,, a non-invasive 

Because of the nature of tape unit 121, failure status 65 PROC (PROC 625) is used to investigate the status of 

information only is collected from this unit and no diag- this particular facility. Decision block 801 checks if the 

nostic tests are actually run on it. However, as will be failing facility is of unit-type 68; and if it is, path 815 is 

illustrated later, a recommendation may be made to followed to block 802. Blocks 802 through 811 utilize 
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PROC 625 to check the status of the failing DS-1 trunk cution of the diagnostic portion of PROC 650 is be- 

unit If the failing facility is not of unit-type 68, then tween 19 and 36. If the returned code is in that range, 

path 816 is followed to block 805 which executes the then field experience has shown that PROC 650 should 

diagnostic portion of PROC 620. be re-executed since those alarms may clear themselves. 

After execution of the diagnostic portion of PROC 5 This is performed by following path 924 to block 916. If 

620; decision block 803 checks if the alarm status after the returned fault code is not in that range, path 925 is 

execution of block 805 indicates that the alarm had been followed to block 912. After re-execution of PROC 650 

cleared. If the alarm has been cleared, path 817 is fol- in block 916, decision block 917 determines whether the 

lowed to block 812. If the alarm has not been cleared, alarm has been cleared on the second pass. If the alarm 

path 820 is followed to decision block 804. Decision 10 has been cleared, path 929 is followed to block 918 

block 804 checks whether the alarm after execution of which performs a function similar to that performed by 

PROC 620 has changed from a unit-type 56 or 57 alarm block 909. If the alarm has not been cleared, decision 

(intramodule data store or light guide interface fault) to block 917 transfers control via path 928 to block 912. 

unit-type 23 alarm (duplication channel fault) If this This latter block in conjunction with decision block 913 

change has occurred, field experience has found that the 15 utilizes the "next fault" command of PROC 650 to ob- 

unit-type 23 alarm must first be cleared before any other tain all of the outstanding fault information. Once all the 

alarms can be processed. The unit-type 23 alarm is fault information has been collected, path 931 is fol- 

cleared by following path 821 to decision block 806 lowed to block 914 which marks the alarm as "un- 

which via path 818 re-executes PROC 620 on the dupli- cleared" and transfers to connector 910. 
cation channel via block 805. 20 FIG. 10 is a continuation of FIG. 9. Blocks 1001 

If there has not been a change in the unit-type of the through 1007 obtain information to make up a report 

alarm, decision block 804 transfers control to block 807 similar to FIG. 28. Block 1001 is used to execute PROC 

via path 822. Block 807 marks the alarm as "uncleared" 256 to determine additional information about the data 

and transfers control to connector 711 to obtain the link. Decision block 1002 determines whether an error 

board type. This is done so that a replacement recom- 25 has occurred during the execution of PROC 256. If 

mendation can be made. there is an error, path 1011 is followed to connector 702 

After execution of PROC 625 by block 802, decision which results in error processing as illustrated in FIG. 

block 809 is executed to determine whether the test 7. If an error did not occur, path 1012 is followed to 

indicated that the DS-1 trunk facility is failing. If the block 1003. Block 1003 once again uses PROC 256, but 

facility indicates no failures, then block 811 is executed 30 this time specifies the failing data link. Further informa- 

to clear the alarm in PBX 105 via path 823. If the facility tion about that link is obtained using the display portion 

indicates a problem, path 822 is followed to block 810 of PROC 256 in block 1004. After execution of this 

which records the problem. The information recorded block, the specified information is read using block 

in block 810 is utilized to print the information of FIG. 1005. Then block 1006 reads the translation tables in the 

29, line 2902. 35 memory of PBX 105 to determine the nature of the 

In the present example, FIG. 27 illustrates the results applications assigned to the logical channels of the indi- 

of executing PROC 600 to obtain the initial alarms of cated data link to PBX 105. For example, the same 

PBX 105. The third line of block 2702 indicates that computer can run applications to function either as a 

there is another failure (unit-type 13 alarm, port data message center or as a telemarketing center. Each appli- 

storage unit) which requires the execution of PROC 40 cation is assigned one or more logical channels on the 

620. When PROC 620 is executed to investigate this physical data link between the computer and PBX 105. 

particular alarm at decision block 805, path 816 is fol- After executing block 1007, connector 811 transfers 

lowed to block 803 which checks the result. Then, since control to the procedures that obtain the board type, 
the port data unit in the present example shows no fail- FIG. 11 illustrates the logic flow of PROC 601. This 

ures, control follows path 817 to block 812. Block 812 45 PROC obtains information concerning the units in PBX 

marks this alarm as "cleared" and transfers control to that control the physical environment, e.g., fans, power 

the "get board-type" procedures via connector 711. supplys, battery back-up units, etc. Decision block 1101 

Therefore, the results of executing PROC 620 as dis- first determines whether this PROC is to be run. If so, 

played in line 2903 of FIG. 29 indicate that the unit-type path 1111 is followed to block 1103 which executes 

13 alarm has been cleared. 50 PROC 601. Decision block 1104 checks if the alarm has 

FIG. 9 illustrates the logical flow for PROC 650. This' been marked clearly by PBX 105. If it has, path 1112 is 
PROC is used to test DCIU 124 of PBX 105. This data followed to block 1105 and from there to connector 
transmission facility has eight distinct data links. Each 702. If the alarm is not cleared, path 1113 is followed to 
data link interconnects PBX 105 to a computer or com- block 1106. The latter block takes the information re- 
munications systems. Examples of such systems are 55 turned from PBX 105 as a result of the execution of 
voice mail and message center systems. PBX 105 is PROC 601 and computes a pseudo-fault code that sum- 
connected only to message center 125. First, PROC 650 marizes information concerning the failure that was 
(test 1) is utilized to determine which of these data links identified. This pseudo-fault code records the possible 
is failing. This information is read from PBX 105 utiliz- multiple causes of the alarm in the history database, 
ing block 904. Block 905 is then utilized, to execute the 60 After execution of block 1106, block 1107 is executed to 
diagnostic portion of PROC 650 (test 2) to perform mark the alarm as uncleared, and control is passed to 
transmission testing on that data link. Decision block connector 702. 

908 then checks if the results of the diagnostic portion FIG. 12 illustrates the logical flow of PROC 622. 

of PROC 650 indicate that the transmission test was This latter PROC tests peripheral equipment attached 

successful. If so, then blocks 909 and 910 are executed 65 to PBX 105, such as telephone stations and data termi- 

via path 922. If the transmission test failed, path 923 is nals. Block 1202 executes the diagnostic portion of 

followed to decision block 911. The latter decision PROC 622 after providing the type and location of the 

block determines if the fault code returned by the exe- failing peripheral equipment. PROC 622 runs the diag- 
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nostic portion of the test several times on the failing through 1705 illustrate the logical flow for printing the 

peripheral unit to fully evaluate the state of the unit information for PROC 620. Block 1701 determines if 

Decision block 1203 determines whether the unit failed there is any information to be displayed for PROC 620. 

under test If the indicated unit did not fail, path 1213 is If there is no information to be displayed, path 1710 

followed to block 1204 which marks the alarm as 5 transfers control to block 1706. If there is information to 

cleared If the unit failed under test, control is trans- be displayed, control is transferred , to block 1702 via 

ferred via path 1214 to block 1205. Blocks 1205 and path 1711. The latter block prints the display header and 

1206 obtain detailed results of the multiple diagnostic then transfers control to blocks 1703 through 1705. 

tests performed on the peripheral unit After all this These blocks display the data on , 2901 through 2903 of 

information has been obtained, path 1216 is followed to 10 FIG. 29. Similar logical flow as used by blocks 1702 

block 1207 which marks the alarm as uncleared and through 1705 is utilized in block 1706 for printing infor- 

control is transferred to connector 711. mation gathered by the execution of PROCs 650, 601, 

FIG. 15 illustrates the logical flow of PROC 612. 622, 610, and 612 provide for PROC 620. After execu- 

This latter PROC is executed every time that expert tion of block 1706, control is transferred to connector 

system 102 contacts a PBX such as PBX 105. PROC 612 15 1707. 

interrogates the PBX to determine whether the PBX FIG. 18 illustrates a check for reniaining alarms and 

has undergone any software initializations. For exam- the display of alarm information which, for the present 

pie, software initializations occur if the program exe- example, results in the generation of FIG. 31. The alarm 

cuted by PBX 105 is repeatedly interrupted by a parity state of the PBX is again checked to ensure that it is 

error or programming problem. The information ob- 20 consistent with the text results seen by expert system 

tained by PROC 612 is utilized to predict in advance 102. In the present example, unit-type alarm 13 should 

whether a particular PBX is approaching a critical point be cleared; however, unit-type alarms 2 and 68 should 

where it may have a severe service outage. If such a still remain on PBX 105. Block 1801 executes PROC 

situation is detected, a craftsperson may be dispatched 600 to interrogate PBX 105 regarding alarms existing on 

to prevent an actual outage from occurring. 25 that PBX. Decision block 1802 checks if there are any 

Block 1501 executes PROC 612; and block 1502 ob- remaining alarms. If there are no remaining alarms, 

tains information about any initialization causes appear- control is passed to connector 1808 via path 1811. If 

ing in the PBX's log. Decision block 1503 looks at the there are reniaining alarms, block 1803 is executed via 

resulting fault codes to determine their seriousness. If a path 1812. This latter block displays the header informa- 

serious fault code is detected, then path 1508 is followed 30 tion illustrated in FIG. 31 for the present example, 

to block 1504. This latter block makes a record to track Blocks 1805 through 1807 then determine what alarms 

these conditions in the INTT database maintained by are present on PBX 105 and display this information as 

expert system 102. An example of a minor fault is an illustrated in FIG. 31. After all the alarms have been 

on-site craftsperson who simply stopped the PBX pro- displayed, control is passed via path 1814 to connector 

cessor to perform maintenance functions. Decision 35 1808. 

block 1505 insures that all initialization log entries have FIG. 19 illustrates the logical flow of checking the 

been checked. After all entries have been checked, off-line processor of PBX 105. First, decision block 

decision block 1505 transfers control via path 1511 to 1901 checks if computer 122 is duplicated. If it is not, 

connector 1506. path 1915 is followed to connector 1907 which executes 

FIG. 16 illustrates the logical flow for obtaining the 40 the program segment which recommends which hard- 
board or circuit pack type. PROC 600 may provide ware, if any, should be replaced on PBX 105. If corn- 
incomplete information concerning the location of the puter 122 is duplicated, decision block 1903 is executed 
failing circuit board. Block 1601 determines whether via path 1916. Before maintenance, administration, and 
the location of the failing init is completely known. If tape functions can be accessed in this PBX, decision 
the location of the failing circuit board is not com- 45 block 1903 checks whether PBX 105 requires mode 
pletely known, control is transferred to connector 702, permission. If modes are not required, path 1917 is fol- 
If the location is known, decision block 1602 is executed lowed to block 1909. If modes are required, block 1904 
to determine whether the administration mode has been is executed via 1918. Block 1904 requests the mode data 
set If the administration mode has not been set, then the from the off-line processor of PBX 105. Note that the 
board type cannot be determined since this information 50 mode data for the on-line processor of PBX 105 was 
is stored within PBX 105 and the adininistration mode is previously obtained in FIG. 3. Decision block 1905 
necessary to obtain that information. If the administra- checks whether the off-line maintenance mode is avail- 
tion mode has been set, then block 1603 is executed. The able. If the maintenance mode is not available on the 
latter block executes PROC 290 on PBX 105 to obtain off-line processor of PBX 105, then control is trans- 
the board type and board vintage from the PBX. This 55 ferred via path 1919 to block 1906. This transfer results 
information is read by block 1604. Next the ADMIN- in the recommendation being set to display the fact that 
ISTRATION database of expert system 102 is interro- the off-line maintenance mode is not available.. Then, 
gated to ascertain whether the board is correctly admin- control is passed to connector 1907. If decision block 
istered on PBX 105. The decision of whether the admin- 1905 determines that the maintenance mode is available, 
istration is correct is performed in decision block 1607. 60 control is transferred via path 1920 to block 1908 which 
If the board is incorrectly administered, block 1608 is sets the maintenance mode in PBX 105. Next, block 
executed to highlight this fact so that a craftsperson can 1909 executes PROC 600 and block 1910 obtains the 
readminister the board within PBX 105. Lastly, control number of outstanding alarms on the off-line processor 
is passed to connector 702. of PBX 105. Decision block 1911 determines whether 

FIG. 17 illustrates the logical flow for printing the 65 there are any outstanding alarms on the off-line proces- 

information gathered by the diagnostic PROCs for this sor of PBX 105. If there are no outstanding alarms, 

session with PBX 105. See FIGS. 28 through 30 for an control is transferred to connector 1907 via path 1921 so 

example of this printed information. Blocks 1701 that the "recommend hardware" portion of the pro- 


10/09/2003, EAST Version: 1.04.0000 


15 


4,972,453 


16 


10 


15 


gram can be executed. If there are alarms on the off-line 
processor, then control is transferred via path 1922 to 
connector 1912 so that these off-line alarms can be dis- 
played and eventually the "recommend hardware" por- 
tion of the program is executed. 
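
The decision flow described for FIG. 19 can be traced with a minimal
Python sketch. It is an illustration only: the OffLineState record, its
field names, and the step labels are hypothetical names introduced
here, and nothing below is taken from Microfiche Appendix A.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OffLineState:
    """Hypothetical snapshot of what is known about the PBX."""
    duplicated: bool                  # is computer 122 duplicated? (block 1901)
    modes_required: bool              # does the PBX need mode permission? (block 1903)
    maintenance_mode_available: bool  # off-line maintenance mode check (block 1905)
    off_line_alarms: List[int] = field(default_factory=list)  # unit types from PROC 600


def check_off_line_processor(state: OffLineState) -> List[str]:
    """Return the steps the FIG. 19 flow would take, in order."""
    if not state.duplicated:
        # No off-line processor to test; go straight to recommendations.
        return ["recommend hardware"]
    steps: List[str] = []
    if state.modes_required:
        if not state.maintenance_mode_available:
            return ["note off-line maintenance mode unavailable",
                    "recommend hardware"]
        steps.append("set maintenance mode")
    steps.append("run PROC 600 on off-line processor")
    if state.off_line_alarms:
        steps.append("display off-line alarms")
    steps.append("recommend hardware")
    return steps
```

In every branch the flow ends at the "recommend hardware" portion of
the program; the branches differ only in how much of the off-line
processor can be examined first.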

FIG. 20 illustrates the logical flow for displaying the results of the
PBX 105 interrogation performed by the logical flow illustrated in
FIG. 19. The logical flow of blocks 2001 through 2003 and block 2006
is similar to that of FIG. 17. The difference between FIGS. 17 and 20
is the actions taken by blocks 2004 and 2005. These two blocks
determine if any alarms noted on the off-line processor of PBX 105 had
been previously encountered during testing of PBX 105's on-line
processor. If an alarm had been previously found during testing of the
on-line processor, then that alarm is cleared in the off-line
processor by block 2005 since the alarm probably resulted from an
unduplicated portion of the system which was reported to both
processors. After all of the off-line alarms have been displayed,
control is transferred to connector 1907 via path 2013. The present
example assumes that the off-line processor of PBX 105 had no alarms
and results in the display illustrated in FIG. 32.
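
The comparison performed by blocks 2004 and 2005 amounts to a
membership test against the alarms already handled on the on-line
processor. A minimal Python sketch follows; the (unit type, location)
tuple layout and the function name are assumptions made for
illustration.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical alarm key: (unit type, location string).
Alarm = Tuple[int, str]


def split_off_line_alarms(
    off_line: Iterable[Alarm], seen_on_line: Set[Alarm]
) -> Tuple[List[Alarm], List[Alarm]]:
    """Partition off-line alarms into (to_clear, remaining).

    An off-line alarm already seen while testing the on-line processor
    is assumed to come from an unduplicated part of the system and is
    queued for clearing; all other alarms remain outstanding.
    """
    to_clear: List[Alarm] = []
    remaining: List[Alarm] = []
    for alarm in off_line:
        if alarm in seen_on_line:
            to_clear.append(alarm)
        else:
            remaining.append(alarm)
    return to_clear, remaining
```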

FIG. 21 illustrates the logical flow of the portion of the program
that determines what replacement parts, if any, should be recommended.
A service technician is dispatched to take those parts and to perform
the necessary maintenance on PBX 105 to clear the alarms remaining
after expert system 102 has finished the session with PBX 105. First,
decision block 2101 checks if there are any environmental alarms. If
there are environmental alarms, then control is transferred to block
2102 via path 2111. The reason is that an environmental alarm in a
cabinet often results in spurious reports of other hardware failures
within that cabinet. An example of an environmental condition is an
over-temperature alarm. When circuit packs are operated outside of
their recommended operating temperature range, the packs exhibit error
conditions that disappear when normal conditions are restored.
Therefore, no hardware recommendations are made for replacement of
boards operating under these conditions.
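
The rule that boards in a cabinet with an environmental alarm receive
no replacement recommendation can be sketched in Python as a simple
filter. The Alarm record, its cabinet and environmental fields, and
the function name are hypothetical; how the actual rule base
identifies cabinets and environmental alarms is not stated here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Alarm:
    unit_type: int
    cabinet: str                 # hypothetical cabinet identifier
    environmental: bool = False  # hypothetical flag for environmental alarms


def alarms_needing_recommendation(alarms: List[Alarm]) -> List[Alarm]:
    """Keep only alarms outside cabinets that report an environmental alarm."""
    affected = {a.cabinet for a in alarms if a.environmental}
    return [a for a in alarms
            if not a.environmental and a.cabinet not in affected]
```

An empty result corresponds to the case in which no recommendations
are made at all.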

After block 2102 has been performed or if there are no environmental
alarms, decision block 2103 is executed. The latter decision block
determines whether there are any remaining alarms that are not in a
cabinet exhibiting environmental alarms. If there are no such alarms,
then control is transferred to connector 2104 via path 2113 and no
recommendations will be made. If there are alarms which are not in a
cabinet that has an environmental alarm, control is transferred to
block 2105 via path 2114. Block 2105 picks a particular alarm. The
present example uses the unit-type 2 alarm. Next, decision block 2106
checks if this alarm has an entry in the SINGLE FAILURE database
illustrated in FIG. 34 for the present example. Since in the present
example the unit-type 2 alarm, indicating tape unit 121, with a fault
code of 925 is found within this database, control is passed via path
2116 to block 2107 which displays the recommendation from the SINGLE
FAILURE database. In the present example, the recommendation is that
the tape cartridge should be replaced (see FIG. 33). For a unit-type
alarm and fault code not found in the SINGLE FAILURE database, control
is transferred to connector 2108 via path 2115.

FIG. 22 illustrates the logical flow for checking whether the alarm
condition is a transient one that has occurred enough times to warrant
a hardware replacement. First, decision block 2201 checks if the
unit-type alarm and fault code appears in the MULTI-FAILURE database,
a sample entry of which is illustrated in FIG. 35. If there is an
entry for the alarm under investigation within the latter database,
path 2211 is followed to decision block 2202. The latter decision
block first obtains from the HISTORY database the number of
occurrences and time period of this particular alarm in PBX 105. This
information is compared with the MULTI-FAILURE database record to
determine if there have been enough identical failures within the
specified time interval to exceed the threshold set in the
MULTI-FAILURE database. If this threshold is exceeded, then block 2203
is executed via path 2214. Block 2203 displays the replacement
equipment recommendations obtained from the MULTI-FAILURE database.
Blocks 2204 through 2206 govern a condition which field experience has
shown to require special handling. The situation arises when multiple
unit-type 13 alarms indicating failure of several port data store
units appear within the same module. This condition does not indicate
that the circuit packs containing the port stores should be replaced
but rather that the time slot interchange arithmetic logic unit in
this module is at fault and should be replaced. If decision block 2204
finds that the alarm is a unit-type 13 alarm, then decision block 2205
via path 2216 checks if other unit-type 13 alarms exist in the same
module. If multiple alarms of this unit-type exist, then block 2206 is
executed via path 2219. The latter block displays the recommendation
that the time slot interchange arithmetic logic unit should be
replaced. Finally, block 2207 is executed which marks the alarm as
having been checked for a recommendation; and control is then
transferred to connector 2211.

FIG. 23 illustrates the logical flow for updating the HISTORY database
which contains information on the alarms handled by expert system 102
on PBX 105. To be included in the HISTORY database, the new alarm must
have a time of occurrence distinct from any other occurrence of the
same type and for the same facility already recorded in the HISTORY
database. This time of occurrence is determined by the time stamp
information received from PBX 105 by PROC 600 and is displayed in FIG.
27 in lines 2702 under the day, hour and minute columns. If decision
block 2302 determines that a new entry should not be created, path
2308 is followed to block 2304. If a new entry is required, block 2303
is executed via path 2309. Block 2303 creates a new record for this
alarm in the HISTORY database. If a particular component or circuit
pack is failing routinely, then there will be multiple entries for
that unit within the HISTORY database. Block 2304 marks the alarm as
checked and decision block 2305 determines whether there are any
remaining unchecked alarms for this session. If there are no remaining
alarms, then path 2311 is followed to connector 2306.

FIG. 24 shows the final steps performed to end the current PBX 105
session. Block 2401 indicates that all files are closed and the proper
steps taken to exit from this session by expert system 102.
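
The interplay between the HISTORY update of FIG. 23 and the threshold
check of FIG. 22 can be illustrated with a short Python sketch. It is
made under stated assumptions: the record layouts, field names, and
the strict "greater than" comparison are hypothetical, since the
actual database formats (FIG. 35) are not reproduced in this text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class HistoryEntry:
    unit_type: int
    location: str
    fault_code: int
    occurred_at: datetime  # time stamp reported by the PBX


@dataclass(frozen=True)
class MultiFailureRule:  # hypothetical layout of a MULTI-FAILURE record
    unit_type: int
    fault_code: int
    threshold: int
    window: timedelta
    recommendation: str


def record_alarm(history: List[HistoryEntry], entry: HistoryEntry) -> bool:
    """Add entry unless the same type, facility, and time is already stored."""
    key = (entry.unit_type, entry.location, entry.occurred_at)
    for old in history:
        if (old.unit_type, old.location, old.occurred_at) == key:
            return False
    history.append(entry)
    return True


def multi_failure_recommendation(history: List[HistoryEntry],
                                 rule: MultiFailureRule,
                                 location: str,
                                 now: datetime) -> Optional[str]:
    """Return the recommendation if recent identical failures exceed the threshold."""
    cutoff = now - rule.window
    count = sum(1 for e in history
                if e.unit_type == rule.unit_type
                and e.fault_code == rule.fault_code
                and e.location == location
                and e.occurred_at >= cutoff)
    # Assumption: "exceed" is read as strictly greater than the threshold.
    return rule.recommendation if count > rule.threshold else None
```

Rejecting duplicate time stamps in record_alarm matters for the count:
without it, the same alarm read in two sessions would be counted twice
toward the threshold.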

It is to be understood that the above-described em- 
bodiment is merely illustrative of the principles of the 
invention and that other arrangements may be devised 
by those skilled in the art without departing from the 
spirit and scope of the invention. 
We claim:

1. A method for remotely maintaining computer systems by an expert
system in conjunction with a central reporting center to which said
computer systems report self-detected faults, comprising the steps of:

accessing said central reporting center by said expert system to
obtain the identity of one of said computer systems reporting detected
faults;

opening a maintenance session with said identified computer system by
said expert system via the public telephone network in a similar
manner as a human technician;

invoking diagnostic procedures on said identified computer system by
said expert system to gather data about said reported faults;

analyzing said data to determine the severity of each of said reported
faults by said expert system with respect to said reported faults
being transitory and permanent type of faults; and

clearing transitory ones of said reported faults by said expert system
upon said transitory ones of said reported faults being determined to
be less severity than permanent ones of said reported faults.

2. The method of claim 1 further comprising the step of establishing a
plurality of databases containing field experience on components in
each of said computer systems concerning replacement of said
components;

interrogating said plurality of databases for each of said permanent
faults to determine when to replace components in said identified
computer system; and

displaying a message defining each of said components to be replaced.

3. The method of claim 2 further comprises the step of interrogating
said plurality of databases by said expert system to determine the
number of fault occurrences of each component having a transitory
fault in said identified computer system;

interrogating said plurality of databases to determine whether said
number for each of said components exceeds a predefined threshold; and

recommending replacement of each of said components whose number
exceeds said predefined threshold.

4. A method for remotely maintaining telephone switching systems by an
expert system in conjunction with a central reporting center to which
said switching systems report self-detected faults, comprising the
steps of:

accessing said central reporting center by said expert system to
obtain the identity of one of said switching systems reporting
detected faults;

opening a maintenance session with said identified switching system by
said expert system via the public telephone network in a similar
manner as a human technician;

invoking diagnostic procedures on said identified switching system by
said expert system to gather data about said reported faults;

analyzing said data to determine the severity of each of said reported
faults by said expert system with respect to said reported faults
being transitory and permanent type of faults; and

clearing transitory ones of said reported faults by said expert system
upon said transitory ones of said reported faults being determined to
be less severity than permanent ones of said reported faults.

5. The method of claim 4 further comprising the step of establishing a
plurality of databases containing field experience on components in
each of said switching systems concerning replacement of said
components;

interrogating said plurality of databases for each of said permanent
faults to determine when to replace components in said identified
switching system; and

displaying a message defining each of said components to be replaced.

6. The method of claim 5 further comprises the step of interrogating
said plurality of databases by said expert system to determine the
number of fault occurrences of each of said components having a
transitory fault in said identified switching system;

interrogating said plurality of databases to determine whether said
number for each of said components exceeds a predefined threshold; and

recommending replacement of each of said components whose number
exceeds said predefined threshold.

7. The method of claim 6 wherein said identified switching system has
a control computer including duplicated processors with one of said
processors actively controlling said identified switching system and
the other of said processors being in a standby condition, the method
further comprising the step of testing said other processor in said
standby condition.

8. The method of claim 7 wherein said switching systems are of
different manufactured vintages and said invoking step comprises the
step of requesting from said identified switching system the vintage
of said identified switching system thereby being able to utilize the
correct diagnostic procedures.

9. A method for remotely determining replacement of components in
telephone switching systems by an expert system in conjunction with a
central reporting center to which said switching systems report
self-detected faults, comprising the steps of:

maintaining a history database of said expert system to record
detected faults by component type and component location and time of
fault for each of said switching systems;

maintaining a multifault database to store on the basis of field
experience recommendations on the replacement of components in said
switching systems on the basis of component type and component
location and time of fault by said expert system;

accessing said central reporting center by said expert system to
obtain the identity of one said switching systems reporting detected
faults;

opening a maintenance session with said identified switching system by
said expert system in a similar manner as used by a human technician;

invoking diagnostic procedures on said identified switching system by
said expert system to gather data about said reported faults and the
gathered data including component type and component location and time
of fault;

analyzing said data to determine the severity of each of said reported
faults by said expert system with respect to being transitory and
permanent types of faults;

interrogating said history database with the gathered data by said
expert system to determine the number of fault occurrences of each of
said components having a transitory fault in said identified switching
system; and

recommending replacement of each of said components having said number
that exceeds a predefined threshold in said multifault database.

10. The method of claim 9 further comprising the step
of maintaining a single fault database to store on the
basis of field experience recommendations on the replacement
of components for permanent type faults on
the basis of component type and component location by
said expert system;

interrogating said single fault database by said expert 
system for each of said components having a per- 
manent fault in said identified switching systems; 
and 

recommending replacement of each of said components
found in said single fault database.

11. The method of claim 10 wherein said step of recommending
comprises the step of displaying information
to dispatch a service technician to replace the recommended
components.

12. The method of claim 11 wherein said step of opening
comprises the steps of dialing a connection via a
public telephone network to said identified switching
system; and

logging on to said identified switching system by said 
expert system. 

13. The method of claim 12 wherein said interrogating
step of said history database comprises the step of
clearing each of said transitory faults not having an
occurrence in said history database.

14. The method of claim 13 wherein said identified 
switching system has a control computer including 
duplicated processors with one of said processors ac- 
tively controlling said identified switching system and 
the other of said processors being in a standby condi- 
tion, the method further comprising the step of testing 
said other processor in said standby condition. 

15. The method of claim 14 wherein said switching 
systems are of different manufactured vintages and said 
invoking step comprises the step of requesting from said 
identified switching system the vintage of said identified 
switching system thereby being able to utilize the correct
diagnostic procedures.
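
By way of illustration only, the replacement logic recited in claims 9 and 10 can be sketched in Python. This is not the program of the microfiche appendix. The names `Fault` and `recommend_replacements`, the in-memory layouts chosen for the history, multifault, and single fault databases, and the sample values in the comments are all assumptions made for this sketch.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Fault:
    component_type: str  # e.g. "tape unit" (illustrative value)
    location: str        # e.g. "A1" (illustrative value)
    permanent: bool      # True for a permanent fault, False for a transitory one


def recommend_replacements(faults, history, multifault_thresholds, single_fault_db):
    """Return (component_type, location) pairs recommended for replacement.

    history: Counter of transitory-fault occurrences per component (assumed layout).
    multifault_thresholds: component_type -> predefined threshold (assumed layout).
    single_fault_db: set of component types replaced on a permanent fault (assumed).
    """
    recommended = []
    for fault in faults:
        key = (fault.component_type, fault.location)
        if fault.permanent:
            # Claim 10: permanent faults are looked up in the single fault database.
            if fault.component_type in single_fault_db:
                recommended.append(key)
            continue
        # Claim 9: record the transitory fault, then compare the occurrence
        # count against the threshold held in the multifault database.
        history[key] += 1
        threshold = multifault_thresholds.get(fault.component_type)
        if threshold is not None and history[key] > threshold:
            recommended.append(key)
    return recommended
```

In this sketch a transitory fault leads to a recommendation only once its recorded count exceeds the threshold, whereas a permanent fault is recommended as soon as its component type appears in the single fault database.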
Brief Summary Text - BSTX (6):

Modern private branch exchanges (PBX) use a computer to control a
switching network. PBXs are also referred to as customer switching
systems or private automatic branch exchanges (PABX). In addition to
controlling the PBX, the computer is continuously running basic
diagnostic tests not only on itself but also on the switching network
and communication facilities interconnecting the PBX to other PBXs and
other types of computer systems. In addition to permanent
faults/alarms, these diagnostic tests find many transitory faults
within the PBX. The transitory faults may indicate that a component of
the PBX is marginally faulty or that the PBX's environmental
conditions have induced a failure in the PBX. Such environmental
conditions result from a variety of sources ranging from error
conditions on the communication facilities to electrical noise in the
AC power supplied to the PBX at its site. Each fault occurring on a
PBX must be investigated by a service technician to determine the
severity of the fault. When a PBX manufacturer has thousands of PBXs
to maintain in the field, the cost of making such investigations
becomes enormous.


Detailed Description Text - DETX (2):

FIG. 1 illustrates expert system 102 which embodies the principles of
the invention for performing maintenance on a plurality of PBXs (105
and 114) via public telephone network 113. Expert system 102 is
executing on computer 100 which advantageously may be of the AT&T 6300
family of personal computers. PBXs 114 and 105 (also referred to as
customer switching systems) are telephone switching systems whose
telephone switching network is controlled by a stored program
computer. Within each PBX, the computer is constantly executing
diagnostic routines checking for fault conditions. For example, within
PBX 105, computer 122 is periodically running diagnostic routines to
verify not only the state of switch 123 but also the state of the
central office trunks such as DS-1 120, digital transmission facility
(DCIU) 124 and tape unit 121. If a fault is detected with one of these
units, PBX 105 records that fault. Such a fault is commonly referred
to as an alarm or an alarm condition and results in a call being
placed by PBX 105 via the public telephone network 113 to the service
reporting center 104. Upon completion of the call, computer 122
transmits the alarm information to service reporting center 104 where
it is recorded. The process of recording alarms by service reporting
center 104 is referred to as generating a trouble report or trouble
ticket.


Detailed Description Text - DETX (3):

Once service reporting center 104 has recorded the trouble ticket in
its internal database, then either a human technician or expert system
102 accesses service reporting center 104 to obtain the trouble
ticket. Either the technician or expert system 102 accesses PBX 105
via public telephone network 113 to run diagnostic procedures (PROCs)
on computer 122 to perform a complete diagnosis of the state of PBX
105. The accessing of PBX 105 for this purpose is referred to as a
session. For the AT&T System 85, detailed information on the operation
of the PROCs is set forth in the manual entitled, "AT&T System 85,
Release 2, Version 4, Maintenance."


Detailed Description Text - DETX (4):

After either the technician or expert system 102 has initiated a
session with PBX 105, a listing of the alarms found by computer 122 is
obtained and then PROCs are run to perform diagnostic tests and gather
additional information concerning the nature of the alarms. Many
alarms can be resolved through the execution of PROCs and do not
require the replacement of any parts within PBX 105. After finishing
the session with PBX 105, both the technician and expert system 102
generate a report indicating what alarms, if any, still exist on the
system and a recommendation about the desirability of sending a
technician to the site. Expert system 102 also recommends what spare
parts may be needed to resolve the remaining alarms.


Detailed Description Text - DETX (23):

Blocks 310 through 321 determine whether computer 122 in PBX 105 is
duplicated, e.g., has two processors. One processor is online and the
other one is offline waiting to be brought online if the current
online processor fails. If PBX 105 has duplicated processors, it is
necessary to test both of them. Two procedures (PROCs) perform the
duplication testing. These PROCs are detailed in the aforementioned
AT&T Maintenance Manual. First, PROC 275 is executed by block 309.
Decision block 310 checks the information returned by PBX 105 in
response to PROC 275. If no error code is returned by PBX 105, path
332 is followed to decision block 316. If an error code of "3" is
returned, expert system 102 determines that an error may have
occurred, and block 311 once again executes PROC 275 in PBX 105.
Decision block 313 checks the results of the execution of block 311;
and if no error code is returned, path 334 is followed to decision
block 316. If decision block 310 finds an error code other than "3" or
if decision block 313 finds any error code, path 330 or 335,
respectively, is followed from these decision blocks to block 314.
Block 314 executes PROC 613. The results of the execution of PROC 613
are checked by decision block 319. If no error code is returned, path
340 is followed to block 318 since this indicates that PROC 613 has
determined that computer 122 is duplicated. If error code 74 is
returned, path 342 is followed to block 317 since this indicates that
PROC 613 has determined that computer 122 is not duplicated. If any
other error code is returned, path 341 is followed to block 320 which
indicates that the state of duplication is unknown.
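
By way of illustration only, the decision flow of blocks 309 through 320 can be sketched in Python. The callable `run_proc`, which is assumed to take a PROC number and return the error code reported by PBX 105 (or `None` when no error code is returned), is hypothetical, as are the names `Outcome` and `check_duplication`. The processing performed from decision block 316 onward is not described in this passage, so the sketch only reports that this block was reached.

```python
from enum import Enum
from typing import Callable, Optional


class Outcome(Enum):
    PROC_275_OK = "no error from PROC 275; continue at decision block 316"
    DUPLICATED = "block 318: computer 122 is duplicated"
    NOT_DUPLICATED = "block 317: computer 122 is not duplicated"
    UNKNOWN = "block 320: state of duplication is unknown"


def check_duplication(run_proc: Callable[[int], Optional[int]]) -> Outcome:
    code = run_proc(275)            # block 309
    if code == 3:                   # block 310: code "3" triggers one retry
        code = run_proc(275)        # block 311
    if code is None:                # path 332 or path 334
        return Outcome.PROC_275_OK
    # Path 330 (first code other than 3) or path 335 (any code on the retry).
    code = run_proc(613)            # block 314
    if code is None:                # block 319, path 340
        return Outcome.DUPLICATED
    if code == 74:                  # path 342
        return Outcome.NOT_DUPLICATED
    return Outcome.UNKNOWN          # path 341
```

Note that an error code of "3" is retried only once: a second "3" from PROC 275 falls through to PROC 613 like any other error code.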


Detailed Description Text - DETX (31):

There are three overall goals that must be achieved in the execution
of each of the diagnostic PROCs. First, expert system 102 determines
the failures that caused each alarm by executing diagnostic routines
on PBX 105. Secondly, expert system 102 generates a report detailing
the results of the diagnostic routines and indicating the remaining
alarms on the system. Finally, expert system 102 provides
recommendations for replacing hardware on PBX 105 if necessary. An
example of such a recommendation is illustrated in FIG. 33 where it is
recommended to replace the tape cartridge of PBX 105.


Detailed Description Text - DETX (45):

FIG. 9 illustrates the logical flow for PROC 650. This PROC is used to
test DCIU 124 of PBX 105. This data transmission facility has eight
distinct data links. Each data link interconnects PBX 105 to a
computer or communications system. Examples of such systems are voice
mail and message center systems. PBX 105 is connected only to message
center 125. First, PROC 650 (test 1) is utilized to determine which of
these data links is failing. This information is read from PBX 105
utilizing block 904. Block 905 is then utilized to execute the
diagnostic portion of PROC 650 (test 2) to perform transmission
testing on that data link. Decision block 908 then checks if the
results of the diagnostic portion of PROC 650 indicate that the
transmission test was successful. If so, then blocks 909 and 910 are
executed via path 922. If the transmission test failed, path 923 is
followed to decision block 911. The latter decision block determines
if the fault code returned by the execution of the diagnostic portion
of PROC 650 is between 19 and 36. If the returned code is in that
range, then field experience has shown that PROC 650 should be
re-executed since those alarms may clear themselves. This is performed
by following path 924 to block 916. If the returned fault code is not
in that range, path 925 is followed to block 912. After re-execution
of PROC 650 in block 916, decision block 917 determines whether the
alarm has been cleared on the second pass. If the alarm has been
cleared, path 929 is followed to block 918 which performs a function
similar to that performed by block 909. If the alarm has not been
cleared, decision block 917 transfers control via path 928 to block
912. This latter block in conjunction with decision block 913 utilizes
the "next fault" command of PROC 650 to obtain all of the outstanding
fault information. Once all the fault information has been collected,
path 931 is followed to block 914 which marks the alarm as "uncleared"
and transfers to connector 910.
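
By way of illustration only, the retry logic of blocks 905 through 914 can be sketched in Python. Several points are assumptions of this sketch rather than statements of the description: the hypothetical callable `run_test` stands for the diagnostic portion of PROC 650 and returns `None` when the transmission test succeeds or the fault code otherwise; the hypothetical callable `next_faults` stands for repeated use of the "next fault" command; the range "between 19 and 36" is taken as inclusive; and blocks 909 and 918 are taken to mark the alarm as cleared.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Fault codes that field experience says may clear on a second pass
# ("between 19 and 36", assumed here to include both end points).
RETRY_CODES = range(19, 37)


def diagnose_data_link(
    run_test: Callable[[], Optional[int]],
    next_faults: Callable[[], Iterable[int]],
) -> Tuple[str, List[int]]:
    code = run_test()                   # block 905: transmission test
    if code is None:                    # block 908, path 922
        return "cleared", []            # block 909 (assumed to mark cleared)
    if code in RETRY_CODES:             # block 911, path 924
        code = run_test()               # block 916: re-execute PROC 650
        if code is None:                # block 917, path 929
            return "cleared", []        # block 918
    # Path 925 (code outside the range) or path 928 (retry did not clear).
    faults = list(next_faults())        # blocks 912 and 913
    return "uncleared", faults          # block 914
```

The sketch re-executes the test at most once, matching the single "second pass" described for block 916.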


Detailed Description Text - DETX (46):

FIG. 10 is a continuation of FIG. 9. Blocks 1001 through 1007 obtain
information to make up a report similar to FIG. 28. Block 1001 is used
to execute PROC 256 to determine additional information about the data
link. Decision block 1002 determines whether an error has occurred
during the execution of PROC 256. If there is an error, path 1011 is
followed to connector 702 which results in error processing as
illustrated in FIG. 7. If an error did not occur, path 1012 is
followed to block 1003. Block 1003 once again uses PROC 256, but this
time specifies the failing data link. Further information about that
link is obtained using the display portion of PROC 256 in block 1004.
After execution of this block, the specified information is read using
block 1005. Then block 1006 reads the translation tables in the memory
of PBX 105 to determine the nature of the applications assigned to the
logical channels of the indicated data link to PBX 105. For example,
the same computer can run applications to function either as a message
center or as a telemarketing center. Each application is assigned one
or more logical channels on the physical data link between the
computer and PBX 105. After executing block 1007, connector 811
transfers control to the procedures that obtain the board type.


Detailed Description Text - DETX (52):

FIG. 17 illustrates the logical flow for printing the information
gathered by the diagnostic PROCs for this session with PBX 105. See
FIGS. 28 through 30 for an example of this printed information. Blocks
1701 through 1705 illustrate the logical flow for printing the
information for PROC 620. Block 1701 determines if there is any
information to be displayed for PROC 620. If there is no information
to be displayed, path 1710 transfers control to block 1706. If there
is information to be displayed, control is transferred to block 1702
via path 1711. The latter block prints the display header and then
transfers control to blocks 1703 through 1705. These blocks display
the data on 2901 through 2903 of FIG. 29. A logical flow similar to
that used by blocks 1702 through 1705 for PROC 620 is utilized in
block 1706 for printing the information gathered by the execution of
PROCs 650, 601, 622, 610, and 612. After execution of block 1706,
control is transferred to connector 1707.
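
By way of illustration only, the printing flow of blocks 1701 through 1706 can be sketched in Python. The function name `format_report`, the mapping from PROC number to a list of data lines, and the header and indentation format are assumptions of this sketch; the actual layout of FIGS. 28 through 30 is not reproduced here.

```python
# Order in which the PROCs are named in the description of FIG. 17.
REPORT_ORDER = (620, 650, 601, 622, 610, 612)


def format_report(results):
    """Build the printed lines for one session.

    results: mapping of PROC number -> list of data lines gathered by
    that PROC (an assumed layout for this sketch).
    """
    lines = []
    for proc in REPORT_ORDER:
        data = results.get(proc)
        if not data:
            # Block 1701 (and its counterparts in block 1706): nothing to
            # display for this PROC, so skip it.
            continue
        lines.append(f"PROC {proc}")                 # block 1702: header (format assumed)
        lines.extend(f"  {item}" for item in data)   # blocks 1703 through 1705
    return lines
```

A PROC that gathered no information produces neither a header nor data lines, which corresponds to path 1710 bypassing blocks 1702 through 1705.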


